TERMINATION PROOFS FOR STRING REWRITING
SYSTEMS VIA INVERSE MATCH-BOUNDS* 

Alfons Geser, Dieter Hofbauer*, and Johannes Waldmann


ABSTRACT 

Annotating a letter by a number, one can record information about its history during 
a reduction. A string rewriting system is called match-bounded if there is a global 
upper bound to these numbers. In earlier papers we established match-boundedness as 
a strong sufficient criterion for both termination and preservation of regular languages. 

We show now that the string rewriting systems whose inverse (left and right hand sides 
exchanged) is match-bounded, also have exceptional properties, but slightly different 
ones. Inverse match-bounded systems effectively preserve context-free languages; their 
sets of normalizable strings and their sets of immortal strings are effectively regular. 

These sets of strings can be used to decide the normalization, the termination and the 
uniform termination problems of inverse match-bounded systems. 

We also show that the termination problem is decidable in linear time, and that a 
certain strong reachability problem is decidable, thus solving two open problems of 
McNaughton’s. 

Inverse match-boundedness, unlike match-boundedness, does not entail termination. 

Like match-bounds, inverse match-bounds prove linear derivational complexity in the 
terminating case. 

1 INTRODUCTION 

The termination and uniform termination problems are undecidable for string rewriting 
systems (also called semi-Thue systems). 

The two problems amount to the membership and emptiness problems for the set of 
immortal strings, i.e., strings that initiate an infinite derivation. For any class of string 
rewriting systems where this set is effectively a regular language, the two problems are 
decidable. This is the basis of a new automated termination criterion. 

By annotating a letter with a number, one can record information about its history during 
a reduction. A string rewriting system is called match-bounded if there is an upper bound 
to these numbers. In earlier papers [8, 9], we showed that match-bounded string rewriting
systems terminate, have linear derivational complexity, and preserve regular languages. A
match-bounded system can be encoded into a deleting system [12], and the descendant
relation of the latter can be decomposed into components that are known to preserve regular
languages.

In the present article we focus on string rewriting systems whose inverse (left and right 
hand sides of the rules exchanged) is match-bounded. Such systems cover classes of sys- 
tems like McNaughton’s inhibitor systems and Ginsburg and Greibach’s terminal bounded 
grammars. We show that the set of normalizing strings is effectively regular for inverse 
match-bounded systems, whence it is decidable whether such systems are normalizing. We 
also show that certain strong reachability problems are decidable, solving an open problem of
McNaughton’s. 

Our main achievement is to prove that the set of immortal strings is effectively regular for 
inverse match-bounded systems. As mentioned above, we thus prove that the termination 
and the uniform termination problems are decidable for inverse match-bounded systems. 
Again we use the decomposition result for deleting systems. For inverse deleting systems we 
get preservation of context-free languages. This is sufficient, as we can show, to construct 
a finite automaton that recognizes the set of immortal strings. Its membership problem is 
decidable in linear time, solving another open problem of McNaughton’s. 

Inverse match-bounded systems, unlike match-bounded systems, need not terminate. If 
they terminate, their derivational complexity is linear. 

The article is organized as follows. After recalling definitions and essential results on
deleting and match-bounded rewriting systems in Sections 3 and 4, we study inverse
match-boundedness in Section 5 (normalization properties, preservation of context-freeness,
reachability properties) and Section 6 (termination properties). Derivational complexity is
investigated in Section 7. Finally, we briefly describe an implementation of our algorithms in
Section 8.

We have presented some of the results reported here at the 28th International Sympo- 
sium on Mathematical Foundations of Computer Science MFCS 2003 at Bratislava, Slovak 
Republic [8]. 

2 PRELIMINARIES 

Standard notations for strings and string rewriting can be found, for instance, in [3]. A
string rewriting system over an alphabet $\Sigma$ is a relation $R \subseteq \Sigma^* \times \Sigma^*$, inducing the rewrite
relation $\to_R \;=\; \{(x \ell y, x r y) \mid x, y \in \Sigma^*, (\ell, r) \in R\}$ on $\Sigma^*$. Unless indicated otherwise, all
rewriting systems are finite. Pairs $(\ell, r)$ from $R$ are frequently referred to as rules $\ell \to r$,
and by $\mathrm{lhs}(R)$ and $\mathrm{rhs}(R)$ we denote the sets of left (resp. right) hand sides of $R$. The
reflexive and transitive closure of $\to_R$ is $\to_R^*$, often abbreviated as $R^*$, and $\to_R^+$ or $R^+$
denotes the transitive closure. An $R$-derivation is a (finite or infinite) sequence $(x_0, x_1, \ldots)$
with $x_i \to_R x_{i+1}$ for all $i$. Define the set of immortal strings $\mathrm{Im}(R)$ as the set of all $x_0 \in \Sigma^*$
that initiate an infinite $R$-derivation. We call $R$ terminating on $L \subseteq \Sigma^*$ if $\mathrm{Im}(R) \cap L = \emptyset$.
If $R$ is terminating on $\Sigma^*$, we say that $R$ is terminating. In order to classify lengths of
derivations for terminating systems, define the derivation height function modulo $R$ on $\Sigma^*$
by $\mathrm{dh}_R(x) = \max\{n \in \mathbb{N} \mid \exists y \in \Sigma^* : x \to_R^n y\}$. The derivational complexity of $R$ is defined
as the function $n \mapsto \max\{\mathrm{dh}_R(x) \mid |x| \le n\}$ on $\mathbb{N}$.
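
The definitions of $\to_R$ and $\mathrm{dh}_R$ can be made concrete with a minimal Python sketch. The function names and the example system are our own illustrative assumptions and do not come from the cited literature; the sketch only makes sense for systems that terminate on the given string.

```python
from functools import lru_cache

def successors(rules, x):
    """Return all y with x ->_R y; rules is a list of (lhs, rhs) string pairs."""
    out = set()
    for lhs, rhs in rules:
        i = x.find(lhs)
        while i != -1:
            out.add(x[:i] + rhs + x[i + len(lhs):])
            i = x.find(lhs, i + 1)
    return out

def derivation_height(rules, x):
    """dh_R(x), assuming R terminates on x (otherwise the recursion does not stop)."""
    @lru_cache(maxsize=None)
    def dh(s):
        return max((1 + dh(t) for t in successors(rules, s)), default=0)
    return dh(x)

# Hypothetical example system (not from the text): each step removes exactly
# one "a before b" inversion, so every derivation from aabb has length 4.
R = [("ab", "ba")]
print(successors(R, "aabb"))         # {'abab'}
print(derivation_height(R, "aabb"))  # 4
```

Memoizing on the string keeps the search feasible for short inputs; it is not meant as an efficient procedure.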

A rewriting rule $\ell \to r$ is context-free if $|\ell| \le 1$, and a rewriting system is context-free if
all its rules are. Throughout we use $\epsilon$ for the empty string and $|x|$ for the length of a string
$x$.

For a relation $\rho \subseteq A \times B$ let $\rho(a) = \{b \in B \mid (a, b) \in \rho\}$ for $a \in A$ and $\rho(A') = \bigcup_{a \in A'} \rho(a)$
for $A' \subseteq A$. The inverse of $\rho$ is $\rho^- = \{(b, a) \mid (a, b) \in \rho\} \subseteq B \times A$, and we say that $\rho$
satisfies the property inverse $P$ if $\rho^-$ satisfies $P$. Define $\mathrm{Inf}(\rho) = \{a \in A \mid \rho(a) \text{ is infinite}\}$;
the relation $\rho$ is finitely branching if $\mathrm{Inf}(\rho) = \emptyset$.

The set of descendants of a language $L \subseteq \Sigma^*$ modulo some string rewriting system $R$ is
$R^*(L)$. The system $R$ is said to preserve regularity (context-freeness) if $R^*(L)$ is a regular
(context-free) language whenever $L$ is. For standard results on rational transductions we
refer to [2].

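For a relation $\rho \subseteq \Sigma^* \times \Sigma^*$ and a set $\Delta \subseteq \Sigma$ let $\rho|_\Delta = \rho \cap (\Delta^* \times \Delta^*)$. Note the difference
between $R^*|_\Delta$ and $(R|_\Delta)^*$ for a string rewriting system $R$. For $R = \{a \to b, b \to c\}$ over
$\Sigma = \{a, b, c\}$ and $\Delta = \{a, c\}$, e.g., we have $(a, c) \in R^*|_\Delta$, but $(a, c) \notin (R|_\Delta)^*$.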

A relation $s \subseteq \Sigma^* \times \Gamma^*$ is a substitution if $s(\epsilon) = \{\epsilon\}$ and $s(xy) = s(x)s(y)$ for $x, y \in \Sigma^*$,
so $s$ is uniquely determined by the languages $s(a)$ for $a \in \Sigma$. For a family of languages $\mathcal{L}$
over $\Gamma$, the substitution $s$ is an $\mathcal{L}$-substitution if $s(a) \in \mathcal{L}$ for $a \in \Sigma$. For instance, if $\mathcal{L}$ is the
family of finite (context-free) languages, then $s$ is a finite (resp. context-free) substitution.
If $\epsilon \notin s(a)$ for every $a \in \Sigma$, then $s$ is epsilon-free. Note that a finite substitution is finitely
branching, and the same holds for the inverse of a finite and epsilon-free substitution.

3 DELETING STRING REWRITING SYSTEMS 

In this section we briefly recall definitions and results regarding deleting string rewriting
systems [12], a topic that can be traced back to Hibbard [11]. This class of string rewriting 
systems enjoys a strong decomposition property. As an immediate consequence, deleting sys- 
tems preserve regularity of languages, and inverse deleting systems preserve context-freeness. 
All these results will be frequently used in the sequel. For proofs and for a description of 
the decomposition algorithm we refer to [12]. 

Definition 1. A string rewriting system $R$ over an alphabet $\Sigma$ is $>$-deleting for an irreflexive
partial ordering $>$ on $\Sigma$ (a precedence) if $\epsilon \notin \mathrm{lhs}(R)$, and if for each rule $\ell \to r$ in $R$ and for
each letter $a$ in $r$, there is some letter $b$ in $\ell$ with $b > a$. The system $R$ is deleting if it is
$>$-deleting for some precedence $>$.

Proposition 1 ([12]). Every deleting string rewriting system is terminating, and has linear 
derivational complexity. 
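
As a hedged illustration of Definition 1, the following Python sketch checks the $>$-deleting condition for one given precedence. The precedence is encoded by a hypothetical `rank` table (a letter is greater when its rank is larger), which is our own simplification; the example rule is the inverse of the inhibitor rule discussed in Example 2, with the inhibitor written as the ASCII letter `i`.

```python
def is_deleting_for(rules, rank):
    """Check Definition 1 for the precedence b > a iff rank[b] > rank[a]."""
    for lhs, rhs in rules:
        if lhs == "":
            return False                      # epsilon must not be a left-hand side
        top = max(rank[b] for b in lhs)       # rank of a greatest letter of lhs
        if any(rank[a] >= top for a in rhs):  # some letter of rhs is not dominated
            return False
    return True

# The inhibitor 'i' is greater than every other letter.
rank = {"i": 1, "a": 0, "b": 0}
inverse_rule = [("aaibabba", "baa")]
forward_rule = [("baa", "aaibabba")]
print(is_deleting_for(inverse_rule, rank))  # True
print(is_deleting_for(forward_rule, rank))  # False
```

A result of `False` only rules out the given precedence; deciding whether some precedence works would require a search over orderings, which this sketch does not attempt.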

Furthermore, we have the following effective decomposition result. 

Theorem 1 ([12]). Let $R$ be a deleting string rewriting system over $\Sigma$. Then there are an
extended alphabet $\Gamma \supseteq \Sigma$, a finite substitution $s \subseteq \Sigma^* \times \Gamma^*$, and a context-free string rewriting
system $C$ over $\Gamma$ such that $R^* = (s \circ C^{-*})|_\Sigma$.

Corollary 1 ([11, 12]). Every inverse deleting string rewriting system effectively preserves 
context-free languages. 

Corollary 2 ([12]). Every deleting string rewriting system effectively preserves regularity. 

In the present paper, we will frequently refer to a slightly specialized version of the above 
theorem. 
Corollary 3. Let $R$ be a deleting string rewriting system over $\Sigma$ such that $\epsilon \notin \mathrm{rhs}(R)$. Then
there are an extended alphabet $\Gamma \supseteq \Sigma$, an epsilon-free finite substitution $s \subseteq \Sigma^* \times \Gamma^*$, and an
epsilon-free context-free substitution $c \subseteq \Sigma^* \times \Gamma^*$ such that

$$R^* = s \circ c^- .$$

Note that as a direct consequence we get $R^{-*} = R^{*-} = (s \circ c^-)^- = c \circ s^-$.

Proof. By Theorem 1 we have $R^* = (s \circ C^{-*}) \cap (\Sigma^* \times \Sigma^*)$, where $s \subseteq \Sigma^* \times \Gamma^*$ is a finite
substitution, and $C$ is a context-free string rewriting system over $\Gamma$. Reviewing the construction
in [12], we find that $s$ is epsilon-free, and that no $\epsilon$ can occur on either side of $C$, thus
$C^* \cap (\Sigma^* \times \Gamma^*)$ coincides with an epsilon-free context-free substitution $c \subseteq \Sigma^* \times \Gamma^*$. Note
that $C^{-*} = C^{*-}$. □

Remark 1. The assumption $\epsilon \notin \mathrm{rhs}(R)$ in Corollary 3 cannot be dropped. Consider the
example $R = \{aa \to \epsilon\}$, and assume $R^* = s \circ c^-$ for substitutions $s$ and $c$. We have
$c(\epsilon) = \{\epsilon\}$, since $c$ is a substitution. Now, $(aa, \epsilon) \in R^*$ implies $(aa, \epsilon) \in s$, thus $\epsilon \in s(a)$.
But then we have $(a, \epsilon) \in s \circ c^-$, contradicting $(a, \epsilon) \notin R^*$. Note that $R = \{a \to \epsilon\}$ is not a
counterexample, as in this case $R^* = s$ for the substitution $s : a \mapsto \{a, \epsilon\}$.

4 MATCH-BOUNDED STRING REWRITING SYSTEMS 

The theory of deleting systems can be applied to obtain results for match-bounded rewriting. 
A derivation is match-bounded if dependencies between rule applications are limited. To 
make this precise, we annotate positions in strings by natural numbers that indicate their 
match height. Positions in a reduct will get height h + 1 if the minimal height of all positions 
in the corresponding redex was h. In this section, we summarize essential results from [8, 9]. 

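Given an alphabet $\Sigma$, define the morphisms $\mathrm{lift}_c : \Sigma^* \to (\Sigma \times \mathbb{N})^*$ for $c \in \mathbb{N}$ by $\mathrm{lift}_c :
a \mapsto (a, c)$, $\mathrm{base} : (\Sigma \times \mathbb{N})^* \to \Sigma^*$ by $\mathrm{base} : (a, c) \mapsto a$, and $\mathrm{height} : (\Sigma \times \mathbb{N})^* \to \mathbb{N}^*$ by
$\mathrm{height} : (a, c) \mapsto c$. For a string rewriting system $R$ over $\Sigma$ with $\epsilon \notin \mathrm{lhs}(R)$ define the
rewriting system

$$\mathrm{match}(R) = \{\ell' \to \mathrm{lift}_c(r) \mid (\ell \to r) \in R,\ \mathrm{base}(\ell') = \ell,\ c = 1 + \min(\mathrm{height}(\ell'))\}$$

over alphabet $\Sigma \times \mathbb{N}$. For instance, the system $\mathrm{match}(\{ab \to bc\})$ contains the rules $a_0 b_0 \to
b_1 c_1$, $a_0 b_1 \to b_1 c_1$, $a_1 b_0 \to b_1 c_1$, $a_1 b_1 \to b_2 c_2$, $a_0 b_2 \to b_1 c_1$, ..., writing $x_c$ as abbreviation for
$(x, c)$. For non-empty $R$, the system $\mathrm{match}(R)$ is always infinite.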
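
The rule set above can be enumerated mechanically once heights are capped. The sketch below is an illustration under our own conventions: annotated strings are tuples of `(letter, height)` pairs, and the cap `c` is only there to keep the list finite.

```python
from itertools import product

def match_rules(rules, c):
    """Rules of match(R) whose left- and right-hand sides use heights 0..c only."""
    out = []
    for lhs, rhs in rules:
        for heights in product(range(c + 1), repeat=len(lhs)):
            h = 1 + min(heights)          # height given to every letter of the reduct
            if h <= c:
                annotated_lhs = tuple(zip(lhs, heights))
                annotated_rhs = tuple((a, h) for a in rhs)
                out.append((annotated_lhs, annotated_rhs))
    return out

M = match_rules([("ab", "bc")], 2)
for lhs, rhs in M:
    print(lhs, "->", rhs)
```

For the rule $ab \to bc$ and cap 2 this lists eight rules, among them the five shown in the text; the left-hand side $a_2 b_2$ is omitted because its reduct would need height 3.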

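Every $\mathrm{match}(R)$-derivation corresponds to an $R$-derivation (i.e., for $x, y \in (\Sigma \times \mathbb{N})^*$, if
$x \to_{\mathrm{match}(R)} y$ then $\mathrm{base}(x) \to_R \mathrm{base}(y)$) and vice versa (i.e., for $v, w \in \Sigma^*$ and $x \in (\Sigma \times \mathbb{N})^*$,
if $v \to_R w$ and $\mathrm{base}(x) = v$, then there is $y \in (\Sigma \times \mathbb{N})^*$ such that $\mathrm{base}(y) = w$ and
$x \to_{\mathrm{match}(R)} y$). In particular, for $n \in \mathbb{N}$ we have $R^n = \mathrm{lift}_0 \circ \mathrm{match}(R)^n \circ \mathrm{base}$, thus
$R^* = \mathrm{lift}_0 \circ \mathrm{match}(R)^* \circ \mathrm{base}$.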

It is convenient to compare the height vectors of strings that have the same base. For
$u, v \in (\Sigma \times \mathbb{N})^*$ we write $u \ge v$ if $\mathrm{base}(u) = \mathrm{base}(v)$ and $\mathrm{height}(u) \ge_n \mathrm{height}(v)$, where
$\ge_n$ denotes the pointwise greater-or-equal ordering on $\mathbb{N}^n$. The relations $\ge$ and $\to_{\mathrm{match}(R)}$
commute:

Lemma 1 ([9]). $\ge \circ \to_{\mathrm{match}(R)} \;\subseteq\; \to_{\mathrm{match}(R)} \circ \ge$.

Definition 2. A string rewriting system $R$ over $\Sigma$ is called match-bounded for $L \subseteq \Sigma^*$ by
$c \in \mathbb{N}$ if $\epsilon \notin \mathrm{lhs}(R)$ and $\max(\mathrm{height}(x)) \le c$ for every $x \in \mathrm{match}(R)^*(\mathrm{lift}_0(L))$. If we omit $L$,
then it is understood that $L = \Sigma^*$.

The number $\max(\mathrm{height}(x))$ in Definition 2 (and $\min(\mathrm{height}(\ell'))$ in the definition of
$\mathrm{match}(R)$) denotes the maximum (minimum, respectively) over the corresponding sequences
of heights; we set $\max(\epsilon) = 0$, and we leave $\min(\epsilon)$ undefined as this case is excluded in the
definition of $\mathrm{match}(R)$. Note that $\max(\mathrm{height}(x)) \le c$ is equivalent to $x \le \mathrm{lift}_c(\mathrm{base}(x))$.
Obviously, a system that is match-bounded for L is also match-bounded for any subset 
of L by the same bound. Further, by Lemma 1, if R is match-bounded for L then R is 
match-bounded for R*(L), again by the same bound. 
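
For a finite start language, the heights reached by match-derivations can be explored directly. The following Python sketch is only an experiment under stated assumptions: the helper names and the `cap` parameter (which aborts the search once some height exceeds it) are ours, and the procedure says nothing about infinite languages such as the set of all strings.

```python
from itertools import product

def lift(x, c=0):
    return tuple((a, c) for a in x)

def match_successors(rules, x):
    """One-step successors of the annotated string x under match(R); lhs must be non-empty."""
    out = set()
    for lhs, rhs in rules:
        k = len(lhs)
        for i in range(len(x) - k + 1):
            if all(x[i + j][0] == lhs[j] for j in range(k)):
                h = 1 + min(height for _, height in x[i:i + k])
                out.add(x[:i] + tuple((a, h) for a in rhs) + x[i + k:])
    return out

def max_height(rules, start, cap):
    """Largest height reached from lift_0(start), or None if some height exceeds cap."""
    seen, todo, best = set(), [lift(s) for s in start], 0
    while todo:
        x = todo.pop()
        if x in seen:
            continue
        seen.add(x)
        top = max((h for _, h in x), default=0)
        if top > cap:
            return None
        best = max(best, top)
        todo.extend(match_successors(rules, x))
    return best

# All strings over {a, b, c} of length at most 4 as the start language.
words = ["".join(w) for k in range(5) for w in product("abc", repeat=k)]
print(max_height([("ab", "bc")], words, cap=5))   # 1
print(max_height([("a", "aa")], ["a"], cap=3))    # None
```

In the first system every redex contains a letter $a$ of height 0, since $a$ is never created, so every reduct gets height 1. The second, hypothetical system is not terminating and its heights grow without bound.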

For a match-bounded system $R$, the infinite system $\mathrm{match}(R)$ may be replaced by a
finite restriction. Denote by $\mathrm{match}_c(R)$ the restriction of $\mathrm{match}(R)$ to the alphabet $\Sigma \times
\{0, 1, \ldots, c\}$.

Lemma 2 ([9]). If $R$ is match-bounded for $L$ by $c$, then for $n \in \mathbb{N}$, $R^n|_L = (\mathrm{lift}_0 \circ \mathrm{match}_c(R)^n \circ
\mathrm{base})|_L$, thus $R^*|_L = (\mathrm{lift}_0 \circ \mathrm{match}_c(R)^* \circ \mathrm{base})|_L$.

Lemma 3 ([9]). For all $c \in \mathbb{N}$, the system $\mathrm{match}_c(R)$ is deleting.
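
One way to see why Lemma 3 is plausible: every letter of a reduct has a height strictly larger than the minimal height in the redex, so ordering annotated letters by decreasing height gives a precedence that fits Definition 1. The sketch below checks this for a sample system; the choice of precedence is our own assumption and need not be the one used in [9].

```python
from itertools import product

def match_rules(rules, c):
    """Rules of match(R) restricted to heights 0..c, as (lhs, rhs) tuples of pairs."""
    out = []
    for lhs, rhs in rules:
        for hs in product(range(c + 1), repeat=len(lhs)):
            h = 1 + min(hs)
            if h <= c:
                out.append((tuple(zip(lhs, hs)), tuple((a, h) for a in rhs)))
    return out

def deleting_by_height(annotated_rules):
    """Definition 1 for the precedence (a, h) > (b, h2) iff h < h2."""
    return all(
        len(lhs) > 0 and all(any(h < h2 for _, h in lhs) for _, h2 in rhs)
        for lhs, rhs in annotated_rules
    )

print(deleting_by_height(match_rules([("baa", "aaibabba")], 3)))  # True
```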

(Dually we know that if R is deleting, then R is match-bounded.) This connection with 
deleting rewriting makes results from Section 3 applicable. By Proposition 1, match-bounded 
systems terminate and have linearly bounded derivation lengths. Further, Corollary 2 implies 
that match-bounded systems preserve regularity, therefore match-boundedness by a given 
bound is decidable. These results are stated here for later reference. 

Theorem 2 ([9]). If R is match-bounded for L , then R is terminating on L. 

Proposition 2 ([9]). Every match-bounded string rewriting system has linear derivational 
complexity. 

Theorem 3 ([9]). If R is match-bounded for a regular language L, then R*(L) is effectively 
regular. 

Theorem 4 ([9]). The following problem is decidable: 

Given: A string rewriting system $R$; a regular language $L$; $c \in \mathbb{N}$.

Question: Is R match-bounded for L by c? 

5 INVERSE MATCH-BOUNDED STRING REWRITING SYSTEMS 

Our focus in this article is on string rewriting systems that are inverse match-bounded, 
i.e., systems obtained from match-bounded ones by exchanging left and right hand sides of
rules. Inverse match-bounded systems have both interesting applications and nice decidabil- 
ity properties. In this section, we show that the set of normalizing strings can be effectively 
determined for this class of rewriting systems, therefore normalization becomes decidable. 
Further, since context-free languages are preserved, we get decidability of a rather strong 
version of the reachability problem. 
Example 1. Peg solitaire is a one-person game. The objective is to remove pegs from a board.
A move consists of one peg X hopping over an adjacent peg Y, landing on the empty space
on the opposite side of Y. After the hop, Y is removed. Peg solitaire on a one-dimensional
board corresponds to the string rewriting system

$$P = \{■■□ \to □□■,\ □■■ \to ■□□\}$$

where ■ stands for “peg”, and □ for “empty”. One is interested in winning positions, i.e.,
the language of all positions that can be reduced to one single peg, which is $P^{-*}(□^*■□^*)$.
Regularity of $P^{-*}(□^*■□^*)$ is a “folklore theorem”, see [15] for its history. In particular,
this was shown by Ravikumar [16] who considered change bounds, a concept that is closely
related to match bounds, but only applicable to length-preserving string rewriting systems.
The system $P$ is inverse match-bounded by 2, so we obtain yet another proof of that result.
Note that in this example, $P$ and its inverse $P^-$ are isomorphic due to the strong symmetry
of the system, which is uncommon.

Example 2. McNaughton [14] introduced a class of string rewriting systems that has good
decidability properties. A system $R$ is called an inhibitor system, if there is a letter $\iota \notin \Sigma$,
the inhibitor, such that $\ell \in \Sigma^+$ and $r \in (\Sigma \cup \{\iota\})^* \setminus \Sigma^*$ for every rule $\ell \to r$ in $R$. Each
inhibitor system is inverse deleting for the ordering that makes the inhibitor $\iota$ greater than
every other letter, hence it is inverse match-bounded by 1. For example, the inhibitor
system $R = \{baa \to aa\iota babba\}$ is inverse match-bounded by 1 as height 2 does not occur in
$\mathrm{match}_2(R^-)^*(\{a_0, b_0, \iota_0\}^*) = (\{a_0, b_0, \iota_0\} \cup b_1\{a_1, b_1\}^* a_1)^*$.

5.1 Normalization Properties 

A string is called normalizing if it has a descendant that is in normal form. A string rewriting 
system is called normalizing if every string is. 

Theorem 5. For an inverse match-bounded string rewriting system, the set of normalizing 
strings is effectively regular. 

Proof. The set of normal forms is $\mathrm{NF}(R) = \Sigma^* \setminus (\Sigma^* \cdot \mathrm{lhs}(R) \cdot \Sigma^*)$, which is therefore regular.
Hence by Theorem 3, the set of normalizing strings, $R^{-*}(\mathrm{NF}(R))$, is effectively regular. □

Note that a system $R$ is normalizing if and only if $R^{-*}(\mathrm{NF}(R)) = \Sigma^*$, so normalization
of inverse match-bounded systems is decidable.
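
The notions of normal form and normalizing string can be tried out on single strings with a bounded breadth-first search. This Python sketch is not the regular-language construction of Theorem 5; the function names and the depth bound are our own assumptions. The system used is the one from Example 3, written with plain letters.

```python
def successors(rules, x):
    """All y with x ->_R y."""
    out = set()
    for lhs, rhs in rules:
        i = x.find(lhs)
        while i != -1:
            out.add(x[:i] + rhs + x[i + len(lhs):])
            i = x.find(lhs, i + 1)
    return out

def is_normal_form(rules, x):
    """x is in NF(R) iff no left-hand side occurs in x as a factor."""
    return not any(lhs in x for lhs, _ in rules)

def find_normal_form(rules, x, max_depth):
    """Breadth-first search for a normal form reachable from x in at most max_depth steps."""
    level = {x}
    for _ in range(max_depth + 1):
        for s in sorted(level):
            if is_normal_form(rules, s):
                return s
        level = set().union(*(successors(rules, s) for s in level))
    return None

R = [("bbabbb", "abbbbbba")]          # b^2 a b^3 -> a b^6 a
nf = find_normal_form(R, "bbabbbbbb", 6)
print(nf, is_normal_form(R, nf))
```

Starting from $b^2ab^6$, which also initiates a loop, the search reaches a normal form after five steps, so a string can be normalizing without being terminating.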

Corollary 4. The following problem is decidable: 

Given: An inverse match-bounded string rewriting system R. 

Question: Is R normalizing? 

Example 3. The system $R = \{b^2ab^3 \to ab^6a\}$ is normalizing although it admits the loop
$b^2ab^6 \to_R ab^6ab^3 \to_R ab^4ab^6a$ [7]. We get $\mathrm{match}_2(R^-)^*(\{a_0, b_0\}^*) = (\{a_0, b_0\} \cup (b_1^2)^+ a_1 (b_1^3)^+)^*$,
hence $R$ is inverse match-bounded by 1. The construction of $R^{-*}(\mathrm{NF}(R)) = \Sigma^*$ by Theorem 5
yields another proof that $R$ is normalizing.
5.2 Preserving Context-freeness

Inverse match-bounded systems preserve context-free languages. This result will be used in 
Section 5.3 and in Section 6. 

Theorem 6. For a context-free language $L$, if a string rewriting system $R$ is inverse
match-bounded for $R^*(L)$ then $R^*(L)$ is context-free.

Proof. Let $\bar{L} = R^*(L)$, and let $R^-$ be match-bounded for $\bar{L}$ by $c$. For $\rho = \mathrm{lift}_0 \circ \mathrm{match}_c(R^-)^* \circ
\mathrm{base}$ we get $R^{-*}|_{\bar{L}} = \rho|_{\bar{L}}$ by Lemma 2. Intersection with $\bar{L} \times \bar{L}$ on both sides of this equation
yields $R^{-*} \cap (\bar{L} \times \bar{L}) = \rho \cap (\bar{L} \times \bar{L})$, which by $R^{-*} = R^{*-}$ is equivalent to $R^* \cap (\bar{L} \times \bar{L}) =
\rho^- \cap (\bar{L} \times \bar{L})$. We have $R^*(\bar{L}) \subseteq \bar{L}$, and by definition of $\rho$ one proves $\rho^-(\bar{L}) \subseteq \bar{L}$, hence
$R^*|_{\bar{L}} = \rho^-|_{\bar{L}}$, so $R^*(L) = \rho^-(L)$ from $L \subseteq \bar{L}$.

The system $\mathrm{match}_c(R^-)$ is deleting by Lemma 3, so $\mathrm{match}_c(R^-)^{-*}$ effectively preserves
context-freeness by Corollary 1. Also the inverse morphisms $\mathrm{base}^-$ and $\mathrm{lift}_0^-$ effectively
preserve context-freeness, so the same is true for $\rho^- = \mathrm{base}^- \circ \mathrm{match}_c(R^-)^{-*} \circ \mathrm{lift}_0^-$. □

Typically, one would choose a suitable regular language $M \supseteq R^*(L)$ in order to prove
that $R$ is inverse match-bounded for $M$ by Theorem 4. The obvious choice $M = \Sigma^*$ leads to
the following corollary.

Corollary 5. Every inverse match-bounded string rewriting system effectively preserves 
context-free languages. 

Remark 2. Under the weaker assumption that $R$ is inverse match-bounded just for the
context-free language $L$ we cannot guarantee that $R^*(L)$ is again context-free. This is shown
by the following example. Consider the rewrite system $R = \{\epsilon \to abc, ac \to ca, ab \to
ba, bc \to cb\}$ over alphabet $\{a, b, c\}$ and the language $L = \{\epsilon\}$. It is not difficult to see that
$R^*(L) \cap c^*b^*a^* = \{c^n b^n a^n \mid n \ge 0\}$, a language that is well-known to be not context-free. The
system $R$, however, is inverse match-bounded for $L$ (by 0, since $L$ contains normal forms
modulo $R^-$ only).

Example 4. Ginsburg and Greibach [10] have shown that terminal bounded grammars gen- 
erate context-free languages. Rules of this type of grammars have a non-empty string of 
nonterminals as left hand side, and at least one occurrence of a terminal letter in the right 
hand side. Therefore, a terminal bounded grammar R is inverse match-bounded by 1, since 
terminal letters always have height 0 in $R^-$-derivations. So Corollary 5 provides another
proof of this result from [10]. 

Example 5. The system $R = \{ab \to da, ac \to acc\}$ is inverse match-bounded by 1. For
the context-free language $L = \{ab^nc^n \mid n \ge 1\}$ we get the context-free language $R^*(L) =
\{d^{n_1} a b^{n_2} c^n \mid n_1, n_2 \ge 0,\ n_1 + n_2 = n \ge 1\} \cup \{d^n a c^m \mid m \ge n \ge 1\}$.
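
The description of the descendant set in Example 5 can be checked experimentally up to a length bound. The Python sketch below is such a bounded check, not a proof; the function names and the length bound `max_len` are our own assumptions. Pruning by length is sound here because no rule of this system shortens a string.

```python
def descendants(rules, x, max_len):
    """All descendants of x of length <= max_len (valid when no rule shrinks a string)."""
    seen, todo = {x}, [x]
    while todo:
        s = todo.pop()
        for lhs, rhs in rules:
            i = s.find(lhs)
            while i != -1:
                t = s[:i] + rhs + s[i + len(lhs):]
                if len(t) <= max_len and t not in seen:
                    seen.add(t)
                    todo.append(t)
                i = s.find(lhs, i + 1)
    return seen

def claimed(n, max_len):
    """Strings of the stated form for a fixed n, again up to length max_len."""
    first = {"d" * n1 + "a" + "b" * (n - n1) + "c" * n for n1 in range(n + 1)}
    second = {"d" * n + "a" + "c" * m for m in range(n, max_len - n)}
    return {s for s in first | second if len(s) <= max_len}

R = [("ab", "da"), ("ac", "acc")]
for n in (1, 2, 3):
    start = "a" + "b" * n + "c" * n
    print(n, descendants(R, start, 9) == claimed(n, 9))
```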

5.3 Reachability Properties 

We have seen that inverse match-bounded string rewriting systems effectively preserve context- 
freeness. As an application, we obtain decidability of a strong version of the reachability 
problem. The reachability problem for a class $\mathcal{C}$ of string rewriting systems is defined as:

Given: A string rewriting system $R \in \mathcal{C}$; two strings $x$ and $y$.
Question: Does $x \to_R^* y$ hold?

For the class C of terminating systems, reachability is easily decided by generating the finite 
tree of derivations starting from x and checking whether y appears in this tree. By symmetry 
one solves the problem for the class of inverse terminating systems, which includes the inverse 
match-bounded systems. This is generalized considerably by the following result. 
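
The backward search mentioned above can be sketched as follows. This is a minimal illustration that assumes the inverse system terminates, so that the set of ancestors of $y$ is finite; the function name is hypothetical. The example is the inhibitor system of Example 14, with the inhibitor written as the ASCII letter `i`.

```python
def reachable(rules, x, y):
    """Decide x ->*_R y by exhaustive backward search from y.

    The search stops when the inverse system is terminating, since y then
    has only finitely many ancestors.
    """
    inverse = [(rhs, lhs) for lhs, rhs in rules]
    seen, todo = {y}, [y]
    while todo:
        s = todo.pop()
        if s == x:
            return True
        for lhs, rhs in inverse:
            i = s.find(lhs)
            while i != -1:
                t = s[:i] + rhs + s[i + len(lhs):]
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
                i = s.find(lhs, i + 1)
    return False

R = [("ab", "bbiaa")]
print(reachable(R, "abb", "bbiabbiaa"))  # True
print(reachable(R, "ab", "ba"))          # False
```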

Theorem 7. The following problem is decidable: 

Given: An inverse match-bounded string rewriting system $R$;
a context-free language $L$; a regular language $M$.

Question: Does $\exists x \in L\ \exists y \in M : x \to_R^* y$ hold?

Proof. The question is equivalent to $R^*(L) \cap M \ne \emptyset$, so decidability is a consequence of the
fact that $R^*(L)$ is effectively context-free by Corollary 5. □

Note that this cannot be extended to context-free languages $M$, even in the special case
where $R = \emptyset$. The reason is that the question whether $L \cap M = \emptyset$ holds for given context-free
languages $L$ and $M$ is undecidable.

Corollary 6. The following problem is decidable: 

Given: An inverse match-bounded string rewriting system $R$ over $\Sigma$; two strings $x, y \in
\Sigma^*$.

Question: Does $\exists u, v \in \Sigma^* : x \to_R^* uyv$ hold?

Proof. Employ Theorem 7 for $L = \{x\}$ and $M = \Sigma^*\{y\}\Sigma^*$. □

Example 6. McNaughton [14] shows that the reachability problem is decidable for the class of
inhibitor systems, cf. Example 2. An alternative proof for a more general result is by
Theorem 7. We can even affirmatively solve the following open problem posed by McNaughton [14]
as Open Question 4: Is the following problem decidable: Given: An inhibitor string rewriting
system $R$ over $\Sigma$; strings $x, y \in \Sigma^*$. Question: Does $\exists u, v \in \Sigma^* : x \to_R^* uyv$ hold?

6 DECIDABLE TERMINATION PROPERTIES 

In this section, we prove that termination is decidable for inverse match-bounded systems. 
This is done by showing that the set of immortal strings is regular for inverse deleting, and 
so for inverse match-bounded systems. Examples illustrate our approach. 

Lemma 4. Let $c \subseteq \Sigma^* \times \Gamma^*$ be a substitution, and let $K$ be a regular language over $\Gamma$. Then
$\mathrm{Inf}(c \cap (\Sigma^* \times K))$ is regular.

Proof. Consider a finite automaton $A$ with state set $Q$ that accepts $K$. For $p, q \in Q$, denote
by $L(A, p, q)$ the set of strings $x$ for which there is a path $p \xrightarrow{x} q$ in $A$. We define an
automaton $B$ over alphabet $\Sigma \times \{F, I\}$ as follows. The sets of states, initial states, and final
states of $B$ and $A$ coincide. For $p, q \in Q$ and $a \in \Sigma$, $B$ contains the transition

• $p \xrightarrow{(a, I)} q$ iff the language $c(a) \cap L(A, p, q)$ is infinite,

• $p \xrightarrow{(a, F)} q$ iff the language $c(a) \cap L(A, p, q)$ is finite and non-empty.

We claim that $x = a_1 \ldots a_n \in \mathrm{Inf}(c \cap (\Sigma^* \times K))$ for $a_i \in \Sigma$ and $n \ge 0$ if and only if there is
an accepting path in $B$ that is labelled by $(a_1, b_1) \ldots (a_n, b_n)$ where at least one $b_i$ equals $I$.
(Note that $c(\epsilon) = \{\epsilon\}$, thus $\epsilon \notin \mathrm{Inf}(c)$.) This can be seen as follows.

By definition, $x \in \mathrm{Inf}(c \cap (\Sigma^* \times K))$ if and only if $c(x) \cap K$ is infinite. Each string $y \in c(x)$
is of the form $y = y_1 \ldots y_n$ with $y_i \in c(a_i)$. Thus $y \in c(x) \cap K$ if and only if there is a path
$q_0 \xrightarrow{y_1} q_1 \xrightarrow{y_2} \cdots \xrightarrow{y_n} q_n$ in $A$ accepting $y$, i.e., with $q_0$ initial and $q_n$ final. Denote by $P$ the
set of all such sequences $q_0 \ldots q_n$. By construction, we have $y_i \in c(a_i) \cap L(A, q_{i-1}, q_i)$, and

$$c(x) \cap K = \bigcup_{q_0 \ldots q_n \in P} (c(a_1) \cap L(A, q_0, q_1)) \cdot \ldots \cdot (c(a_n) \cap L(A, q_{n-1}, q_n)) .$$

A finite union of languages is infinite if and only if at least one summand is infinite, and a finite
product of languages is infinite if and only if no factor is empty and at least one factor is infinite.
Thus $c(x) \cap K$ is infinite if and only if there is a sequence $q_0 \ldots q_n \in P$ such that each
$c(a_i) \cap L(A, q_{i-1}, q_i)$ is non-empty and at least one $c(a_i) \cap L(A, q_{i-1}, q_i)$ is infinite. This is
equivalent to the existence of a sequence $b_1 \ldots b_n \in \{F, I\}^n \setminus F^n$ such that
$(a_1, b_1) \ldots (a_n, b_n) \in L(B)$. Therefore,

$$\mathrm{Inf}(c \cap (\Sigma^* \times K)) = \pi(L(B) \setminus (\Sigma \times \{F\})^*)$$

where $\pi : (\Sigma \times \{I, F\})^* \to \Sigma^*$ is the morphism induced by $\pi : (a, b) \mapsto a$. □
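
Once the automaton $B$ is available, membership in the projected language can be tested by a single left-to-right pass that tracks, for each reachable state, whether a label $I$ has been read. The following Python sketch assumes that the transitions of $B$ are given explicitly as a dictionary; computing them from $c$ and $A$ is not shown, and the tiny instance at the end is a hypothetical example of ours.

```python
def in_inf(delta, initial, final, x):
    """Membership of x in pi(L(B) minus (Sigma x {F})*).

    delta maps (state, letter) to a set of (next_state, flag) pairs,
    where flag is 'I' (infinite) or 'F' (finite and non-empty).
    """
    current = {(q, False) for q in initial}   # (state, "some label so far was I")
    for a in x:
        current = {(q2, seen or flag == "I")
                   for q, seen in current
                   for q2, flag in delta.get((q, a), ())}
    return any(q in final and seen for q, seen in current)

# Hypothetical instance: Gamma = {x}, K = Gamma*, c(a) = x* and c(b) = {x}.
# A has one state 0, so B has an (a, I) loop and a (b, F) loop on state 0.
delta = {(0, "a"): {(0, "I")}, (0, "b"): {(0, "F")}}
print(in_inf(delta, {0}, {0}, "bab"))  # True
print(in_inf(delta, {0}, {0}, "bbb"))  # False
```

For a fixed automaton the number of tracked pairs is bounded, so the pass takes time linear in the length of the input string.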

Lemma 5. Let $\Sigma$, $\Gamma$, $\Delta$ be alphabets, let $c \subseteq \Sigma^* \times \Gamma^*$ be a substitution, and let $T \subseteq \Gamma^* \times \Delta^*$
be a finitely branching rational transduction such that also $T^-$ is finitely branching. Then
$\mathrm{Inf}(c \circ T)$ is regular.

Proof. We have $\mathrm{Inf}(c \circ T) \subseteq \mathrm{Inf}(c \cap (\Sigma^* \times T^-(\Delta^*)))$ because $T$ is finitely branching, and
$\mathrm{Inf}(c \circ T) \supseteq \mathrm{Inf}(c \cap (\Sigma^* \times T^-(\Delta^*)))$ because $T^-$ is finitely branching. Thus $\mathrm{Inf}(c \circ T) =
\mathrm{Inf}(c \cap (\Sigma^* \times T^-(\Delta^*)))$. Since $T^-(\Delta^*)$, the domain of the rational transduction $T$, is regular,
we conclude by Lemma 4. □

Remark 3. The regularity results in Lemma 4 and Lemma 5 are effective if $c$ is an
$\mathcal{L}$-substitution for a family $\mathcal{L}$ of languages that is closed under intersection with regular sets,
and for which emptiness and finiteness are decidable. This is the case, e.g., for the family of
context-free languages, as used in the proof of Lemma 6 below.

Lemma 6. For an inverse deleting string rewriting system $R$, the set $\mathrm{Inf}(R^*)$ is effectively
regular.

Proof. Let $R$ be a system over alphabet $\Sigma$ such that $R^-$ is deleting. If $\epsilon \in \mathrm{lhs}(R)$ then
$\mathrm{Inf}(R^*) = \Sigma^*$, so we may assume $\epsilon \notin \mathrm{lhs}(R)$. Then $R^* = c \circ s^-$ by Corollary 3, where $c$ is
a context-free substitution and $s$ is a finite epsilon-free substitution. The claim follows by
Lemma 5 and Remark 3. □

Lemma 7. For an inverse deleting string rewriting system $R$, we have $\mathrm{Im}(R) = \mathrm{Inf}(R^*)$.

Proof. A finitely branching binary relation $\rho$ is well-founded if and only if $\rho^*$ is finitely
branching and $\rho^+$ is irreflexive. Because $R^-$ is deleting, $(R^-)^+$ is well-founded and hence
irreflexive; so $\mathrm{Im}(R) = \mathrm{Inf}(R^*)$. □
Lemma 8. Let the string rewriting system $R$ be inverse match-bounded by $c$, and let $S =
\mathrm{match}_c(R^-)^-$. Then to every infinite derivation $x_0 \to_R x_1 \to_R \cdots$ there is an infinite
derivation $x'_0 \to_S x'_1 \to_S \cdots$ such that $\mathrm{base}(x'_i) = x_i$ for all $i \ge 0$. Therefore, $\mathrm{Im}(R) =
\mathrm{base}(\mathrm{Im}(S))$.

Proof. For every finite initial segment $x_0 \to_R x_1 \to_R \cdots \to_R x_j$ of the given derivation, we
construct (by the remark before Definition 2) a derivation $x_{0,j} \to_S x_{1,j} \to_S \cdots \to_S x_{j,j} =
\mathrm{lift}_0(x_j)$ such that $\mathrm{base}(x_{i,j}) = x_i$. By induction on $j - i$, using the implication

$$x_{i,j'} \ge x_{i,j} \;\wedge\; x_{i-1,j} \to_S x_{i,j} \;\wedge\; x_{i-1,j'} \to_S x_{i,j'} \;\implies\; x_{i-1,j'} \ge x_{i-1,j}$$

for the inductive step, we have $x_{i,j} \le x_{i,j'}$ for all $0 \le i \le j \le j'$. Define $x'_i$ as the maximum
of the finite set $\{x_{i,j} \mid j \ge i\} \subseteq \mathrm{base}^{-1}(x_i) \cap \mathrm{height}^{-1}(\{0, \ldots, c\}^*)$. Then for all $i \ge 0$ we have
$x'_i \to_S x'_{i+1}$ since there is an index $j > i$ such that $x'_i = x_{i,j} \to_S x_{i+1,j} = x'_{i+1}$. □

Theorem 8. For an inverse match-bounded string rewriting system $R$, the set $\mathrm{Im}(R)$ is
effectively regular.

Proof. By Lemma 3, the system $S = \mathrm{match}_c(R^-)^-$ is inverse deleting. Therefore $\mathrm{Im}(S)$
is regular by Lemmas 6 and 7, thus also $\mathrm{Im}(R)$ is regular as $\mathrm{Im}(R) = \mathrm{base}(\mathrm{Im}(S))$ by
Lemma 8. □

Corollary 7. Let $\mathcal{L}$ be a family of languages that is closed under intersection with regular
sets, and for which emptiness is decidable. Then the following problem is decidable:

Given: A language $L \in \mathcal{L}$; an inverse match-bounded string rewriting system $R$.

Question: Is $R$ terminating on $L$?

Proof. By Theorem 8, $\mathrm{Im}(R)$ is regular, so emptiness of $\mathrm{Im}(R) \cap L$ is decidable. □

Corollary 8. Termination and uniform termination are decidable for the class of inverse
match-bounded string rewriting systems.

Proof. Choose $L = \{x\}$ to decide whether there is an infinite derivation starting from string
$x$, and choose $L = \Sigma^*$ to decide uniform termination. □

Example 7. McNaughton [14] shows that termination and uniform termination are decidable 
for the class of inhibitor systems, see Example 2. We can give an alternative proof of this 
result by Corollary 8. 

As the membership problem for a fixed regular language is decidable in linear time, we 
obtain: 

Corollary 9. For every inverse match-bounded string rewriting system R, its termination 
problem is decidable in linear time: 

Given: A string x. 

Question: Is R terminating on x? 
Example 8. This affirmatively solves another question posed by McNaughton [14] as Open
Problem 2, cf. Example 2: For every inhibitor system, is its termination problem decidable
in polynomial time?

Example 9. Consider the following string rewriting systems.

• $\{babaa \to aaababab\}$ is inverse match-bounded by 2.

• $\{aabaaba \to abaabaaab\}$ is inverse match-bounded by 3.

• $\{aaabbab \to abbaaabba\}$ is inverse match-bounded by 3.

• $\{caabca \to aabccaabc\}$ is inverse match-bounded by 2.

Each of these systems $R$ satisfies $\mathrm{Im}(R) = \emptyset$, hence $R$ terminates. The reader is invited
to prove these systems terminating by any other method.

In contrast to match-boundedness, inverse match-boundedness does not entail termina- 
tion. 

Example 10. The inhibitor system $R = \{baa \to aa\iota babba\}$ from Example 2 has a loop of
length 3 initiated by $baaa$. Indeed, $baaa \in \mathrm{Im}(R) = \Sigma^*\{ba, baab\}b^*aa\Sigma^*$.

Example 11. The system $R = \{baaba \to aababbaab\} = \{l \to r\}$ is inverse match-bounded
by 2 and satisfies $\mathrm{Im}(R) = \Sigma^* M \Sigma^*$ for

$$M = \{laa, ll, bla\} \cup \{la, lb, lbaab, bbaab\}b^*l .$$

For instance, $laa$ initiates a loop of length 4.

Example 12. The system $R = \{aabaaba \to abaaabaab\} = \{l \to r\}$ is inverse match-bounded
by 3; we have $\mathrm{Im}(R) = \Sigma^*\{laa, ala, ara\}\Sigma^*$. The string $laa$ initiates a loop of length 3.

Example 13. The system $R = \{bca(bc)^3 \to a(bc)^3c(bc)^2b\} = \{l \to r\}$ is inverse
match-bounded by 2 and admits a loop of length 3 (starting from $bclc$). We have $\mathrm{Im}(R) = \Sigma^* M \Sigma^*$
where

$$M = \{bclc, lca(bc)^3, (bc)^2 l, lcl, lcbcl\} .$$

Example 14. The inhibitor system $R = \{ab \to bb\iota aa\}$ is inverse match-bounded by 1; cf.
Example 2. It admits a loop $abb \to_R bb\iota aab \to_R bb\iota abb\iota aa$. Indeed we get $abb \in \mathrm{Im}(R) =
\Sigma^*\{aab, abab, abb\}\Sigma^*$.

Example 15. Consider the system $R = \{ab \to da, ac \to acc\}$ over $\Sigma = \{a, b, c, d\}$ from
Example 5, which is inverse match-bounded by 1. Here we obtain $\mathrm{Im}(R) = \Sigma^* a b^* c \Sigma^*$.
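
For Example 15 the immortality test of Corollary 9 amounts to searching for the factor pattern $ab^*c$. The Python sketch below uses a regular-expression search as a stand-in for running the finite automaton, and cross-checks it against a bounded derivation search on all short strings. The helper names and the bounds are our own assumptions, and the cross-check relies on the observation that a string without such a factor admits at most as many steps as it has letters $b$.

```python
import re
from itertools import product

RULES = [("ab", "da"), ("ac", "acc")]
IMMORTAL = re.compile(r"ab*c")   # x lies in Sigma* a b* c Sigma* iff the pattern occurs in x

def successors(rules, x):
    """All y with x ->_R y."""
    out = set()
    for lhs, rhs in rules:
        i = x.find(lhs)
        while i != -1:
            out.add(x[:i] + rhs + x[i + len(lhs):])
            i = x.find(lhs, i + 1)
    return out

def has_derivation_of_length(rules, x, n):
    """True iff some derivation starting from x has at least n steps."""
    level = {x}
    for _ in range(n):
        level = set().union(*(successors(rules, s) for s in level))
        if not level:
            return False
    return True

def agrees_up_to(max_len, n):
    """Compare the pattern test with the bounded search on all strings up to max_len."""
    for k in range(max_len + 1):
        for letters in product("abcd", repeat=k):
            x = "".join(letters)
            if bool(IMMORTAL.search(x)) != has_derivation_of_length(RULES, x, n):
                return False
    return True

print(agrees_up_to(4, 8))
```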

Looping one-rule string rewriting systems exist that are not inverse match-bounded: 

Example 16. The system $R = \{aabb \to ba\}$ is shown to be not match-bounded in [9]. Hence
the system $R^-$, which admits the loop $bba \to_{R^-} baabb \to_{R^-} aabbabb$, is not inverse
match-bounded.
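
The loops quoted in the examples of this section are easy to verify mechanically: each consecutive pair must be a rewrite step, and the last string must contain the first one as a factor. The following Python sketch performs this check; the function names are hypothetical, and the inhibitor is written as the ASCII letter `i`.

```python
def one_step(rules, x, y):
    """True iff x ->_R y."""
    return any(
        x[:i] + rhs + x[i + len(lhs):] == y
        for lhs, rhs in rules
        for i in range(len(x) - len(lhs) + 1)
        if x.startswith(lhs, i)
    )

def is_loop(rules, derivation):
    """Check a proposed witness for a loop x ->+ u x v."""
    steps_ok = all(one_step(rules, s, t) for s, t in zip(derivation, derivation[1:]))
    return len(derivation) > 1 and steps_ok and derivation[0] in derivation[-1]

# Example 16 (inverse system), Example 14, and Example 3.
print(is_loop([("ba", "aabb")], ["bba", "baabb", "aabbabb"]))
print(is_loop([("ab", "bbiaa")], ["abb", "bbiaab", "bbiabbiaa"]))
print(is_loop([("bbabbb", "abbbbbba")], ["bbabbbbbb", "abbbbbbabbb", "abbbbabbbbbba"]))
```

All three checks succeed. A loop witnesses non-termination, but the absence of a short loop proves nothing.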

We conclude this section with an example known as Zantema's Problem. Proving termination
of this system is a “modern classic” in rewriting [4, 6, 13, 17, 18, 19], as it provides
a test case where all previous automated methods for termination proofs fail.
Example 17. The one-rule system

$$Z = \{a^2b^2 \to b^3a^3\}$$

is inverse match-bounded by 2 and satisfies $\mathrm{Im}(Z) = \emptyset$, so it terminates. In our implementation
(Haskell code compiled with ghc), the automaton for $\mathrm{match}(Z^-)^*(\mathrm{lift}_0(\Sigma^*))$, which has
199 states, is found in about 40 CPU seconds on a 2.4-GHz Pentium. (This figure also
includes the verification of $\mathrm{Im}(Z) = \emptyset$.) The intermediate constructions according to Theorem
1 involve much larger automata (up to 1576 states with 15999 transitions) over much larger
alphabets (up to 283 letters).

We remark that $Z$ can also be proven terminating by verifying the match bound 4 for
$Z$. However, this computation needs considerably more resources, as the corresponding
intermediate automata have up to 22241 states. Even the refined method of checking the
match bound only for right-hand sides of forward closures [9] reduces this number only
to 11307 states.

While we have no evidence that terminating, inverse match-bounded systems exist that
are not match-bounded, we do have a few examples ($\{babaa \to aaababab\}$ from Example 9,
$\{babaa \to abaabbaba\}$, $\{baaabbaa \to aaabbaaabb\}$) that our tool shows to be inverse
match-bounded (cf. Section 8), but for which the corresponding proof attempt for
match-boundedness fails due to lack of memory.

7 DERIVATION LENGTHS 

Inverse match-bounded systems $R$ have linear size growth for strings that are not in
$\mathrm{Im}(R) = \mathrm{Inf}(R^*)$. Hence terminating, inverse match-bounded systems have linear
derivational complexity.

Theorem 9. For each inverse match-bounded string rewriting system $R$ there is a constant
$k \in \mathbb{N}$ such that $|y| \le k \cdot |x|$ for all $x \in \Sigma^* \setminus \mathrm{Inf}(R^*)$ and $y \in R^*(x)$.

Proof. Since the morphisms base and lift preserve lengths of strings, we may assume that
$R$ is inverse deleting by Lemma 2 and Lemma 3. As in the proof of Lemma 6 we then have
$R^* = c \circ s^-$ by Corollary 3, where $c \subseteq \Sigma^* \times \Gamma^*$ is a context-free substitution and
$s \subseteq \Sigma^* \times \Gamma^*$ is a finite epsilon-free substitution. Further,
$\mathrm{Inf}(R^*) = \mathrm{Inf}(c \circ s^-) = \mathrm{Inf}(c \cap (\Sigma^* \times K))$ as in
the proof of Lemma 5 for $K = s(\Sigma^*)$.

We change the automaton $B$ constructed in the proof of Lemma 4 to an automaton $B'$
over the alphabet $\Sigma \times (\mathbb{N} \cup \{I\})$ as follows. In the case where $c(a) \cap L(A, p, q)$ is finite and
non-empty, we instead draw an arrow labelled by $(a, m)$ where $m$ is the length of the longest
string in $c(a) \cap L(A, p, q)$. Now define $\mu : \Sigma^* \setminus \mathrm{Inf}(R^*) \to \mathbb{N}$ by

$$\mu(a_1 \cdots a_n) = \max \Big\{ \sum_{i=1}^{n} m_i \;\Big|\; (a_1, m_1) \cdots (a_n, m_n) \in L(B') \setminus (\Sigma \times \{I\})^* \Big\}.$$

One verifies for all $x \in \Sigma^* \setminus \mathrm{Inf}(R^*)$ that $\mu(x)$ is the maximal length of strings in the set $R^*(x)$.
One possible choice of $k$ is therefore obtained as the maximum of all second components of
labels in $B'$. □
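
The bound of Theorem 9 can be observed experimentally on small inputs. The following Python sketch is a brute-force illustration and does not use the automaton $B'$ from the proof; the names `descendants`, `max_descendant_length` and the parameter `limit` are hypothetical. It enumerates $R^*(x)$ breadth-first and reports the maximal length of a descendant, giving up when the number of descendants exceeds `limit`, as happens for $x \in \mathrm{Inf}(R^*)$.

```python
from collections import deque

def successors(word, rules):
    """Return the set of all strings reachable from `word` in exactly one step."""
    result = set()
    for lhs, rhs in rules:
        start = word.find(lhs)
        while start != -1:
            result.add(word[:start] + rhs + word[start + len(lhs):])
            start = word.find(lhs, start + 1)
    return result

def descendants(word, rules, limit=10000):
    """Enumerate R*(word) breadth-first; raise once `limit` strings have been seen."""
    seen = {word}
    queue = deque([word])
    while queue:
        for nxt in successors(queue.popleft(), rules):
            if nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("too many descendants; word may lie in Inf(R*)")
                seen.add(nxt)
                queue.append(nxt)
    return seen

def max_descendant_length(word, rules, limit=10000):
    """Maximal length of a string in R*(word), assuming this set is finite."""
    return max(len(y) for y in descendants(word, rules, limit))

# Zantema's system from Example 17.
ZANTEMA = [("aabb", "bbbaaa")]
```

For the string `aabb` and Zantema's system, the only descendants are `aabb` itself and `bbbaaa`, so the maximal descendant length is 6.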

Corollary 10. Every terminating, inverse match-bounded string rewriting system has linear
derivational complexity.

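Proof. If $R^-$ is a match-bounded system, then by Proposition 2 there is a constant $k'$ such
that $x \to_R^n y$ (i.e., $y \to_{R^-}^n x$) implies $n \le k' \cdot |y|$ for strings $x$ and $y$. On the other hand,
since $R$ terminates we have $\mathrm{Inf}(R^*) = \mathrm{Im}(R) = \emptyset$, so
$|y| \le k \cdot |x|$ for some constant $k$ by Theorem 9. Hence $n \le k' \cdot k \cdot |x|$. □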
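
To make the notion of derivational complexity concrete, the following Python sketch computes the length of a longest derivation from a given string by exhaustive search. It is an illustration only and is unrelated to how the constants $k$ and $k'$ are obtained in the proofs; the name `longest_derivation` is hypothetical. The recursion ends only if the system terminates on the given string, which is the case for Zantema's system from Example 17.

```python
from functools import lru_cache

# Zantema's system from Example 17, as a hashable tuple of rules.
ZANTEMA = (("aabb", "bbbaaa"),)

def successors(word, rules):
    """Return the set of all strings reachable from `word` in exactly one step."""
    result = set()
    for lhs, rhs in rules:
        start = word.find(lhs)
        while start != -1:
            result.add(word[:start] + rhs + word[start + len(lhs):])
            start = word.find(lhs, start + 1)
    return result

@lru_cache(maxsize=None)
def longest_derivation(word, rules=ZANTEMA):
    """Length of a longest derivation starting in `word`.

    Caution: on a string that admits an infinite derivation the recursion
    does not end (Python raises RecursionError).
    """
    return max(
        (1 + longest_derivation(nxt, rules) for nxt in successors(word, rules)),
        default=0,
    )
```

For example, `longest_derivation("aabb")` is 1, since `bbbaaa` is a normal form, and `aabbaabb` admits the derivation `aabbaabb`, `bbbaaaaabb`, `bbbaaabbbaaa`, `bbbabbbaaabaaa` of length 3.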

Remark 4. Bounds for both constants $k$ and $k'$ can be determined effectively. We remark
that $k'$ can be exponential in the size of the underlying alphabet $\Sigma \times \{0, \ldots, c\}$ in the worst
case, assuming that $R^-$ is match-bounded by $c$, see [12]. For an algorithm to compute
the constant $k$ see the proof of Theorem 9 above.

All terminating, inverse match-bounded examples in Section 6 have linear derivational
complexity. In particular this is true for Example 17 (Zantema’s Problem), a result due to
Tahhan-Bittar [18].

8 IMPLEMENTING MATCH-BOUNDS: MATCHBOX 

Our program Matchbox can verify that a given string rewriting system $R$ is match-bounded
by a given number for a given regular language. So it can verify inverse match-boundedness
(for $\Sigma^*$) as well, and it provides an “effective version” of Theorem 5. Further, Matchbox
constructs $\mathrm{Im}(R)$ according to Theorem 8. The program can be accessed via a CGI interface
at


http://theo1.informatik.uni-leipzig.de/matchbox/;

its Haskell source is available. In particular, Matchbox is able to prove termination for a
large number of string rewriting systems for which all standard automated methods (like
path orderings, polynomial interpretations (see [5], e.g.), and dependency pairs [1]) fail, and
for which only complicated ad-hoc proofs were known, if any.

9 CONCLUSION 

We showed that inverse match-bounded string rewriting systems, like match-bounded
systems, enjoy nice language preservation and decidability properties. In particular, termination
and uniform termination are decidable for the class of inverse match-bounded systems, and
terminating systems in this class have linear derivation lengths.

In spite of the formal similarity of match-boundedness and inverse match-boundedness, 
the line of reasoning for similar properties is quite different. All the same, we have found no 
terminating examples that are inverse match-bounded but not match-bounded. 